Robust checksums



The invention relates to a method of analyzing the correctness of an output 
signal, the output signal being obtained from transformation of an input signal. In particular 
the invention relates to a method of checking the correct operation of a lossy transformation. 
In such transformation, parts of the signal are deleted in a signal-theoretical sense, but, for 
human perception, the signal remains substantially unchanged. The invention also relates to a
signal analyzer, more particularly, to a receiver and/or to a transmitter adopting the method of 
the invention. 

Lossy transformations are to be seen in contrast to lossless transformations,
such as, for example, a lossless compression, or other forms of lossless data encoding. In a
lossless transformation of a signal, there remains a one-to-one relation between an input
signal and an output signal, or, alternatively, in a transmission of a signal, there remains a
one-to-one relation between a transmitted and a received signal. In this respect, a lossless
encoder provides an encoded signal that is, after decoding, bit by bit identical with its input
signal. So, for such coding transformations, that is, transformations where a signal is
encoded and in a later stage decoded, it is possible to add verification means to the data to
ensure data integrity during the transformations. Such verification is necessary, since
received data may be erroneous owing to noise or damage. It is also possible that in the
decoding step, after reception of the signal, errors are introduced due to hardware or software
defects. The mere untreated transmission of erroneous signals may of course lead to annoying
or even intolerable effects like, for instance, in audio systems, too high noise levels.

One way of verifying the correctness of the received data is as follows: in the 
transmitter a checksum is derived and added to the data. In the receiver again a checksum is 
derived and compared with the checksum as received from the transmitter. If the two 
checksums are identical, the transmission is assumed to be correct, and if the two checksums 
differ, the received data is assumed to be erroneous.

Equality of both checksums implies, with a large probability, that the received
data is bit by bit identical with the transmitted data. Small distortions of the data will cause
the checksums to be different. If the checksums differ, a correction scheme can be
followed in outputting the data, for example, the data can be retransmitted, muted or
interpolated. In this way, the outputting of erroneous signals is prevented or at least treated in
an acceptable manner.

In a lossy encoder, the above described method cannot be applied. In a lossy
encoded transmission, parts of the signal will be lost, hence the term lossy. Thus, even under
normal conditions, although the difference is perceptually not relevant, there is no bit-by-bit
accurate mapping between input and output signal. Therefore, checksums of the data on the
transmitter side will differ from checksums of the data on the receiver side, so a differing
checksum is not an indication of an erroneous transformation of the signal.
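
The difference between the two cases can be illustrated with a minimal sketch. It is not part of the described system: CRC-32 stands in for the unspecified checksum, zlib compression stands in for the lossless encoder, and the coarse quantizer `lossy_roundtrip` is a hypothetical toy lossy transformation.

```python
import zlib


def checksum(samples: bytes) -> int:
    """CRC-32, used here only as a stand-in for the unspecified checksum."""
    return zlib.crc32(samples)


def lossy_roundtrip(samples: bytes, step: int = 4) -> bytes:
    """Toy lossy transformation: round each 8-bit sample down to a multiple of `step`."""
    return bytes((s // step) * step for s in samples)


original = bytes(range(0, 200, 3))                     # toy 8-bit "signal"
lossless = zlib.decompress(zlib.compress(original))    # bit-exact round trip
lossy = lossy_roundtrip(original)                      # small, bounded sample errors

lossless_ok = checksum(lossless) == checksum(original)
lossy_ok = checksum(lossy) == checksum(original)
max_error = max(abs(a - b) for a, b in zip(original, lossy))

print(lossless_ok, lossy_ok, max_error)
```

In this sketch the lossless round trip keeps the checksum, while the lossy one changes it although no sample moved by more than `step - 1`, which is why a differing checksum says nothing about whether a lossy transformation operated correctly.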

The invention aims to overcome this problem and provide a method to check 
the correct operation of a signal transformation, wherein, even in a lossy transformation, a 
robust verification can be performed, to ensure data integrity during transformations. In this 
respect, the term robust is introduced, to identify a verification procedure which is, up to a 
certain extent, invariant to data processing (as long as the processing retains an acceptable 
quality of the content). In this way, the correctness of a signal transformation can be assessed, 
even if the transformation is lossy, like for instance in compression algorithms, wherein large 
parts of the signal are deleted because they are not relevant to human perception. 

Accordingly, the method of the invention comprises the steps of: 

receiving a first robust feature derived from an input signal, wherein the input 
signal has been transformed into the output signal by the signal transformation; 

deriving a second robust feature from the output signal; and 

identifying a degree of similarity between said first robust feature and said 
second robust feature. 

In a further embodiment, the method may comprise the step of correcting the 
output signal into a corrected signal, in dependence on said degree of similarity. 

The method according to the invention is especially applicable in the field of 
data transmission, where data (usually in a compressed format) are transmitted in association
with their robust features. So, in a preferred embodiment, the method of the invention 
comprises the steps of: encoding the input signal into an encoded signal, and transmitting the 
encoded signal and the first robust feature. 

The method may also comprise receiving an encoded signal, and decoding 
said encoded signal into an output signal. 

Although the robust feature may be sent in a separate channel, in a special 
embodiment of the invention, the method comprises the step of embedding the first robust 
feature into the encoded signal through watermark technology.

A preferred way of deriving a robust feature from each of said input and
output signals is by splitting an information signal into successive time intervals, and
computing a hash value from a scalar property or vector of properties of the information
signal within each time interval.

In a still further preferred embodiment, deriving a robust feature from each of
said input and output signals comprises: transforming the information signal within the time
interval into disjoint bands, calculating a property of the signal in each of said bands,
comparing the properties in the bands with respective thresholds, and representing the results
of said comparisons by respective bits of the hash (sample) value.

Said bands may be frequency bands having an increasing bandwidth as a
function of the frequency. Said property may be the energy of a band; said property may also
be the tonality of a band. Other bands and properties are also feasible.
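
As a rough illustration of the band-and-threshold scheme described above, the following sketch splits a signal into successive intervals, computes the energy in each of a few disjoint frequency bands, and turns each threshold comparison into one bit. The naive DFT, the function names, the bin ranges and the threshold values are all assumptions chosen for the example; they are not taken from the description.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitudes of the DFT bins 0..n/2 of a real-valued frame (naive O(n^2) transform)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def interval_hash(frame, bin_edges, thresholds):
    """One bit per band: 1 if the band energy exceeds that band's threshold."""
    mags = dft_magnitudes(frame)
    value = 0
    for lo, hi, threshold in zip(bin_edges, bin_edges[1:], thresholds):
        energy = sum(m * m for m in mags[lo:hi])
        value = (value << 1) | (1 if energy > threshold else 0)
    return value


def robust_feature(signal, interval_len, bin_edges, thresholds):
    """Hash values of successive, non-overlapping time intervals."""
    return [interval_hash(signal[i:i + interval_len], bin_edges, thresholds)
            for i in range(0, len(signal) - interval_len + 1, interval_len)]


n = 8
low = [math.cos(2 * math.pi * 1 * t / n) for t in range(n)]    # energy in DFT bin 1
high = [math.cos(2 * math.pi * 3 * t / n) for t in range(n)]   # energy in DFT bin 3
signal = low + high

feature = robust_feature(signal, n, bin_edges=[0, 2, 5], thresholds=[1.0, 1.0])
quieter = robust_feature([0.9 * x for x in signal], n, bin_edges=[0, 2, 5], thresholds=[1.0, 1.0])
print(feature, quieter)
```

Scaling the signal by 0.9 changes every sample, so a conventional checksum would differ, yet the hash values in this toy case stay the same because each band energy remains on the same side of its threshold.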

Although the method can be applied to any kind of transformation, the method 
is advantageously applied when the transformation is a lossy transformation.

In one specific preferred embodiment, the method comprises:

a) calculating from the input signal a first block of subsequent hash values 
corresponding to a first time interval; 

b) calculating from the output signal a second block of subsequent hash values 
corresponding to a second time interval, at least partially overlapping said first interval; 

c) selecting one hash value from one of said first and second blocks of hash
values;

d) searching for said hash value in the other one of said first and second blocks of
hash values;

e) calculating a difference between the first and second blocks of hash values in
which the hash value found in step (d) has the same position as the selected hash value in the
other one of said first and second blocks;

f) repeating steps (c)-(e) for a further selected hash value until said difference is
lower than a predetermined threshold, or until the number of hash values to be selected is
lower than a predetermined threshold;

g) concluding a correct operation of said signal transformation if the difference is
lower than a predetermined threshold, or concluding a false operation of said signal
transformation if the number of hash values to be selected is lower than a predetermined
threshold.

The latter embodiment is particularly preferable in case no fixed frame
boundaries are present in the signal.
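
Steps (c) to (g) can be sketched as follows. This is only one possible reading: the "difference" is taken to be the bit error rate over the overlapping part of the two aligned blocks, hash values are selected from the first block starting with the last one, and the names `blocks_match`, `max_ber` and `min_remaining` are hypothetical.

```python
def bit_errors(a: int, b: int) -> int:
    """Number of differing bits between two hash values."""
    return bin(a ^ b).count("1")


def blocks_match(first, second, max_ber=0.25, bits=32, min_remaining=0):
    """Return True if an alignment of the two hash blocks has a bit error rate below max_ber."""
    candidates = list(range(len(first)))          # positions not yet selected
    while len(candidates) > min_remaining:        # step (f): stop when too few are left
        i = candidates.pop()                      # step (c): select, starting with the last
        for j, value in enumerate(second):        # step (d): search in the other block
            if value != first[i]:
                continue
            shift = j - i                         # step (e): align found and selected value
            pairs = [(first[k], second[k + shift])
                     for k in range(len(first)) if 0 <= k + shift < len(second)]
            errors = sum(bit_errors(a, b) for a, b in pairs)
            if errors / (bits * len(pairs)) < max_ber:
                return True                       # step (g): correct operation
    return False                                  # step (g): false operation


first = [0x0F0F0F0F, 0x12345678, 0xDEADBEEF, 0x00000001]
shifted = [0xAAAAAAAA, 0x0F0F0F0F, 0x12345679, 0xDEADBEEF, 0x00000003]
print(blocks_match(first, shifted))
print(blocks_match([0x00000000, 0x00000001], [0xFFFFFFFF, 0x00000001]))
```

In the first call the last hash value is not found because it contains a bit error, so the next one is tried; it is found one position later and the aligned blocks differ in only 2 of 128 bits. In the second call the only hash value found leads to a bit error rate of 0.5, so no match is concluded.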

In this embodiment, the further selected hash value may be another hash value 
of the first block of hash values. Alternatively, the further selected hash value may be 
obtained by reversing a bit of the previously selected hash value. In a still further 
embodiment, the method comprises the steps of receiving information indicative of the 
reliability of the bits of the selected hash value, and using said information to determine 
whether or not to use the selected hash value. Alternatively, the method may further comprise 
the steps of receiving information indicative of the reliability of the bits of the selected hash 
value, and using said information to determine the bit to be reversed. 

The invention also relates to a receiver, comprising: receiving means for 
receiving a first robust feature derived from an input signal, wherein the input signal has been 
transformed into the output signal by the signal transformation; 

analysing means for deriving a second robust feature from the output signal;
and

comparing means for identifying a degree of similarity between said first robust
feature and said second robust feature.

The receiver may be a radio, television, computer or any other device 
receiving such signals together with their robust features, but it may also be a microcircuit or 
part of a circuit receiving said signals.

In one embodiment the receiver comprises correcting means responsive to the 
comparing means, for correcting the output signal into a corrected signal. 

In a further embodiment, the receiver receives an encoded signal from a
transmitter, the receiver further comprising decoding means for transforming the encoded
signal into an output signal.

The invention also relates to a transmitter, suitable for transmitting encoded 
signals to be received by said receiver, the transmitter comprising: analyzing means for 
deriving a first robust feature from an input signal; 

encoder means for encoding the input signal into an encoded signal; and 

transmitting means for transmitting the encoded signal and the first robust
feature.

The invention also relates to a data carrier comprising a data channel
corresponding to a multimedia signal and a data channel corresponding to a robust feature
associated with said multimedia signal.

Further objects and features of the invention will become apparent from the
drawings, wherein:

Fig. 1 shows an illustration of a lossless encoding process; 

Fig. 2 shows an illustration of deriving a robust feature from a signal; 

Fig. 3 shows an illustration comparing the robust features of input and output
signal; and

Fig. 4 shows a schematic embodiment of a transmitter transmitting data to a
receiver, wherein the method according to the invention is applied.

In the drawings, like or the same parts are referenced by the same numerals. 

The lossless process, illustrated schematically in Fig. 1 by transformation
channel 1, is illustrative of a newly developed high-quality audio system for consumer
applications: Super Audio CD or SACD, although the process may be applied in other areas of
technology as well, such as, for instance, video or other multimedia signal processing. The
transmission channel 1 comprises, among other things, a lossless encoder 2, a disk 3 on which a
signal encoded by the encoder 2 is stored, and a lossless decoder 4. To check the digital audio
signals from beginning (input signal 5) to end (output signal 6), checksums 7, 8, respectively, are
introduced. At playback a comparator 9 compares on a frame-by-frame basis the checksum 7
of the input signal 5 before the lossless encoder 2 with the checksum 8 of the output signal 6
after the lossless decoder 4. It is then possible to detect errors in the lossless
encoding/decoding transformation, because an error will cause a difference between the two
checksums. In case of an error in the output signal 6, a corrector 10 will mute the signal, or
produce an otherwise corrected signal 60. A reason to check for an error in the coding system
is that such an error may result in high-level noise signals, which are at least annoying.

As is apparent from Fig. 1, the process of checking the correct operation of the
encoding/decoding transformation is not suitable in case the audio encoder 2 is a lossy
encoder instead of a lossless encoder. Lossy means that, in a signal-theoretic sense, there is a
difference between the input and the output signal, but that such a difference is perceptually
not relevant. This implies that even under normal conditions the input signal 5 and the output
signal 6 are not bit-by-bit identical; therefore a checksum cannot be used, since such checksums,
even when the transformation is performed correctly, would not match. Of course, also in a lossy
encoding process, some parameters are transmitted losslessly, so that a checksum on
these intermediate results can be used, but a beginning-to-end check is not possible.

In a transmission channel 1 according to Fig. 1, when audio encoder 2 is a
lossy encoder, input and output signals may differ quite drastically (e.g. by
compression/decompression). Yet, the human perceptual system (HPS) has no problem in
recognizing the "sameness". If the HPS considers the input and output signals "the same", a
hash function should also produce substantially the same hash signal, so that from a
degree of similarity between the respective hash signals, a degree of "sameness" of the
signals can be derived. In this respect, a hash function should not only be able to identify the
content, but should also be able to identify time (intervals). For this reason the following
definition of a robust hash is used herein:

A robust hash is a function that associates to every basic time-unit of audio 
content a semi-unique bit-sequence that is continuous with respect to content similarity as 
perceived by the HPS. 

In other words, if the HPS identifies two signals as being very similar, the 
associated hash values should also be very similar. In particular, if we compute the hash 
values for original content and transformed content, the hash values should be similar. On the 
other hand, if two signals really represent different content, the robust hash should be able to 
distinguish the two signals (semi-unique). The required robustness of the hashing function is 
achieved by deriving the hash function from robust features (properties), i.e. features which 
are largely invariant to processing. 

Fig. 2 shows a schematic diagram of an arrangement for generating a robust
feature from an input signal. The signal, in the example of Fig. 2 being an audio signal 5, is
first downsampled in a downsampler 11 to reduce the complexity of subsequent operations
and to restrict the operation to a frequency range of 300-3000 Hz, which is most relevant for
the human auditory system (HAS).

In a framing circuit 12, the audio signal is divided into frames with an overlap 
factor of 31/32. The overlap is chosen in such a way to ensure a high correlation of the hash 
values between subsequent frames. The spectral representation of every frame is computed 
by a Fourier transform circuit 13. In the next block 14, the absolute value of the (complex) 
Fourier coefficients is computed. 
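
The effect of the 31/32 overlap on the frame positions can be made concrete with a small sketch. The frame length of 2048 samples and the function name `frame_starts` are assumptions for illustration; the description does not state a frame length.

```python
def frame_starts(num_samples: int, frame_len: int,
                 overlap_num: int = 31, overlap_den: int = 32) -> list[int]:
    """Start indices of frames of `frame_len` samples overlapping by overlap_num/overlap_den."""
    hop = frame_len - (frame_len * overlap_num) // overlap_den   # samples between frame starts
    if hop < 1:
        raise ValueError("frame too short for the requested overlap")
    return list(range(0, num_samples - frame_len + 1, hop))


starts = frame_starts(num_samples=4096, frame_len=2048)   # assumed frame length
print(len(starts), starts[:3], starts[-1])
```

With these assumed numbers each frame advances by only 64 samples, so consecutive frames share 1984 of their 2048 samples, which is what keeps the hash values of subsequent frames highly correlated.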

A band division stage 15 divides the spectrum into a number (e.g. 33) of 
bands. In Fig. 2, this is schematically shown by selectors 151, each of which selects the 
Fourier coefficients of the respective band. In a preferred embodiment of the arrangement, 
the bands have a logarithmic spacing, because the HAS also operates on approximately 
logarithmic bands. By choosing the bands in this manner, the hash value will be less
susceptible to processing changes such as compression and filtering. In the preferred
embodiment the first band starts at 300 Hz and every band has a bandwidth of one musical
tone (i.e. the bandwidth increases by a factor of 2^(1/12) ≈ 1.06 per band).
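
A short sketch of such a logarithmic band layout is given below. It assumes that each band edge is the previous edge multiplied by 2^(1/12), which makes the bandwidth grow by the same factor from band to band; the function name and the resulting upper edge are consequences of that assumption, not values stated in the description.

```python
def band_edges(first_hz: float = 300.0, num_bands: int = 33,
               ratio: float = 2 ** (1 / 12)) -> list[float]:
    """Edges of `num_bands` adjacent bands; each edge is `ratio` times the previous one."""
    return [first_hz * ratio ** k for k in range(num_bands + 1)]


edges = band_edges()
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]
print(round(edges[0], 2), round(edges[-1], 2), round(widths[1] / widths[0], 4))
```

Under this reading, 33 bands starting at 300 Hz end at roughly 2018 Hz, inside the 300-3000 Hz range retained by the downsampler.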

Next, for every band a certain (not necessarily scalar) characteristic property is
calculated. Examples of properties are energy, tonality and standard deviation of the power
spectral density. In general the chosen property can be an arbitrary function of the Fourier
coefficients. Experimentally it has been verified that the energy of every band is a property
that is most robust to many kinds of processing. This energy computation is carried out in an
energy computing stage 16. For each band it comprises a stage which computes the sum of
the absolute values of the Fourier coefficients in the band.
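
The per-band summation can be sketched as below. The mapping from Fourier bin index to frequency (`k * sample_rate / fft_size`), the half-open band intervals and the example numbers are assumptions made for the illustration.

```python
def band_energies(magnitudes, sample_rate, fft_size, edges):
    """Sum the absolute Fourier coefficients whose bin frequency falls in each band."""
    bin_hz = sample_rate / fft_size
    energies = [0.0] * (len(edges) - 1)
    for k, magnitude in enumerate(magnitudes):
        freq = k * bin_hz
        for b in range(len(edges) - 1):
            if edges[b] <= freq < edges[b + 1]:    # band b covers [edges[b], edges[b+1])
                energies[b] += magnitude
                break
    return energies


# Toy numbers: 5 bins spaced 1000 Hz apart, two bands.
example = band_energies([1, 2, 3, 4, 5], sample_rate=8000, fft_size=8,
                        edges=[500, 1500, 3500])
print(example)
```

Bins outside all bands (here 0 Hz and 4000 Hz) are simply ignored, which mirrors the selectors picking only the coefficients of their respective band.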

In order to get a binary hash value, the robust properties are subsequently
converted into bits. The bits can be assigned by calculating an arbitrary function of the robust
properties of possibly different frames and then comparing it to a threshold value. The
threshold itself might also be a result of another function of the robust property values.

In the present arrangement, a bit derivation circuit 17 converts the energy
levels of the bands into a binary hash value. In a simple embodiment, the bit derivation stage
generates one bit for each band, for example, a '1' if the energy level is above a threshold and
a '0' if the energy level is below said threshold. The thresholds may vary from band to band.
Alternatively, a band is assigned a hash value bit '1' if its energy level is larger than the
energy level of its neighbor, otherwise the hash value bit is '0'. The present embodiment uses
a further improved version of the latter alternative. To avoid that a major single frequency in
the audio signal would produce identical hash values for successive frames, variations of the
amplitude over time are also taken into account. More particularly, a band is assigned a hash
value bit '1' if its energy level is larger than the energy level of its neighbor and if that was
also the case in the previous frame, otherwise the hash value bit is '0'. The specific form of
the hash function may vary for different embodiments.

To this end, the bit derivation circuit 17 comprises for each band a first
subtractor 171, a frame delay 172, a second subtractor 173, and a comparator 174. The 33
energy levels of the spectrum of an audio frame are thus converted into a 32-bit hash value
H(n,m). The hash values of successive frames are finally stored in a buffer 18, which is
accessible by a computer 19.
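
One possible reading of the subtractor, frame delay, subtractor and comparator chain is sketched below: the first subtractor forms the energy difference between neighboring bands, the frame delay holds that difference from the previous frame, the second subtractor forms the change between the two, and the comparator tests it against a threshold of zero. This interpretation, the function name and the bit ordering (band 0 as most significant bit) are assumptions.

```python
def hash_values(energies):
    """energies[n][m] is the energy of band m in frame n.

    Returns one hash value per frame, starting with the second frame,
    with one bit per pair of neighboring bands.
    """
    hashes = []
    for previous, current in zip(energies, energies[1:]):
        value = 0
        for m in range(len(current) - 1):
            diff_now = current[m] - current[m + 1]       # first subtractor
            diff_before = previous[m] - previous[m + 1]  # same difference, delayed one frame
            bit = 1 if diff_now - diff_before > 0 else 0 # second subtractor and comparator
            value = (value << 1) | bit
        hashes.append(value)
    return hashes


# Three bands give two bits per frame; 33 bands would give 32 bits.
toy = hash_values([[1, 1, 1], [3, 1, 2], [3, 1, 2]])
print(toy)
```

A frame whose band energies equal those of the previous frame yields all-zero differences and therefore a hash value of zero in this sketch.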

Fig. 3 illustrates how a hash signal, derived from the input signal as
shown in Fig. 2, is compared with another hash signal, derived in a similar manner
from an output signal. In this respect, two blocks 20 and 21 of "robust" hash values,
corresponding to the input signal and the output signal, respectively, are present, with
overlapping time intervals.

In a first embodiment of the matching method, it will be assumed that every
now and then a single hash value has no bit errors. A single hash value is selected from the
first hash block 20 and matched with a hash value of the second hash block 21. Initially, the
selected hash value will be the last hash value of the first hash block 20. In the example
shown in Fig. 3, this is the hash value 0x00000001. Let us say that this hash value is present
at position p, which, as can be seen from the Figure, apparently does not correspond to the
right position. In a further step the computer calculates the bit error rate (BER, defined as the
ratio of the number of erroneous bits to the total number of bits) between the hashes of the first
hash block and the hashes of the second block of hash values present at position 0 up to position
p. In a further step it is checked whether the BER is low (<0.25) or high. If the BER is low, the
probability is high that the two hash blocks match, in which case it is concluded that the
signal transformation has been performed correctly. If the BER is high, either the signal
transformation has not been performed correctly, or the previously selected single hash value
contains an error. The latter will be assumed to be the case in this example. Another single
hash value is then selected, for instance, as illustrated in Fig. 3, the last but one single hash
value. This hash value appears to occur in the second block, apparently, as is shown in the
Figure, at the right position. If the BER between the input hash block and the output hash block
appears to be lower than, for example, 0.25, a correct operation of said signal
transformation is concluded.

The computer thus only looks at one single hash value at a time and assumes
that every now and then such a single hash value has no bit errors. The BER between the extracted
hash block and the hash block corresponding to it on the time axis is then determined. If the
BER is below the threshold, it will be concluded that the signal was transformed correctly;
otherwise another single hash value will be tried. If none of the single hash values leads
to success, a false operation of said signal transformation will be concluded.

The above described method relies on the assumption that every now and then
an extracted hash value has no bit errors, i.e. is perfectly equal to the corresponding stored
hash value. However, it is unlikely that hash values without any bit errors occur when the
signal is severely processed. Another embodiment of the matching method uses soft
information of the hash extraction algorithm to find the extracted hash values in the database.
By soft information is meant the reliability of a bit, or the probability that a hash bit has been
retrieved correctly. In this embodiment, the arrangement for extracting the hash values
includes a bit reliability determining circuit 22 (see Fig. 2). This circuit receives the
differential energy band levels in the form of real numbers. If a real number is very close to
the threshold (which is zero in this example), the respective hash bit is unreliable. If, instead,
the number is very far from the threshold, it is a reliable hash bit. The bit reliability
determining circuit 22 of Fig. 2 derives the reliability of every hash bit, and thus enables the
computer 19 to generate a list of most probable alternative hash values for each hash value.
By assuming again that at least one of the alternative hash values is correct, the two hash
blocks can be matched.
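
The use of soft information can be sketched as follows, assuming that the reliability of a bit is simply the distance of its differential energy level from the zero threshold and that alternative hash values are formed by reversing the least reliable bits. Both function names and the `max_flips` parameter are hypothetical.

```python
def bits_and_reliability(differences):
    """Hash value and per-bit reliability from differential energy levels (threshold zero)."""
    value = 0
    for d in differences:
        value = (value << 1) | (1 if d > 0 else 0)
    return value, [abs(d) for d in differences]    # far from zero means reliable


def alternative_hashes(value, reliabilities, max_flips=2):
    """The extracted value followed by variants with the least reliable bits reversed."""
    n = len(reliabilities)
    weakest = sorted(range(n), key=lambda i: reliabilities[i])[:max_flips]
    candidates = [value]
    for mask in range(1, 1 << len(weakest)):
        candidate = value
        for pos, i in enumerate(weakest):
            if (mask >> pos) & 1:
                candidate ^= 1 << (n - 1 - i)      # index 0 is the most significant bit
        candidates.append(candidate)
    return candidates


value, reliabilities = bits_and_reliability([5.0, -0.1, 3.0, 0.2])
candidates = alternative_hashes(value, reliabilities)
print(bin(value), [bin(c) for c in candidates])
```

Each candidate could then be searched for in the other hash block in the same way as a single selected hash value, so that a match remains possible even when the extracted value itself contains a bit error.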

Although such a method according to the invention may be applied in a single
electronic signal processing apparatus, an illustrative application of the method is depicted in
Fig. 4, where dotted lines 23, 24 and 25 illustrate a transmitter, a data carrier and a receiver,
respectively. The transmitter 23 is, for example, a multimedia signal transmitter, transmitting
audio, video, speech, graphic images and the like. The transmission can be a wireless
transmission, or a transmission over the Internet, in fact, any kind of transmission. The
transmission can also be done via a physical data carrier, such as a magnetic disk or a
CD-ROM etc.

The transmitter comprises analyzing means 71 for deriving a first robust
feature 72 from an input signal 51; encoder means 2 for encoding the input signal 51 into an
encoded signal 61; and transmitting means 26 for transmitting the encoded signal 61 and the
first robust feature 72. The analyzing means 71 were explained with reference to Fig. 2 and
may be embedded in hardware, software etc. The same applies to the encoder means 2,
which may be general purpose compression software or any kind of dedicated encoding tool.
Further, the transmitting means 26 may, for example, be a radio or TV transmitter or a remote
server on the Internet.

The data carrier 24 comprises a data channel 27 corresponding to the
multimedia signal 61 and a data channel 28 corresponding to the robust feature 72 associated
with multimedia signal 51. Obviously, the data carrier 24 may be a physical carrier, such as a
magnetic disk or a CD-ROM etc., but it may also be, for instance, an electromagnetic signal that
is broadcast through the air or via a physical network.

The receiver 25, which will be, for example, a television set, a CD-player or a
multimedia computer, comprises a combination of receiving means 29 for receiving the first
robust feature 72; analysing means 81 for deriving a second robust feature 82 from the output
signal; and comparing means 91 for identifying a degree of similarity between said first robust
feature 72 and said second robust feature 82. The receiver further has correcting means 101
responsive to the comparing means 91, for correcting the output signal 61 into a corrected
signal 62. The receiving means 29 may be any kind of adequate readout means for picking up
the data channels 27 and 28 of the data carrier 24, such as, for instance, an antenna, a modem
or a magnetic or optical reading unit.

It will be clear to those skilled in the art that the invention is not limited to the 
embodiment described with reference to the drawing, but may comprise all kinds of 
variations thereof. Such variations are deemed to fall within the scope of protection of the 
appended claims.

Reference numbers:

1. transformation channel
2. lossless encoder
3. disk
4. decoder
5. input signal
6. output signal
7. checksums
8. checksums
9. comparator
10. corrector
11. downsampler
12. framing circuit
13. circuit
14. block
15. stage
16. energy computing stage
17. circuit
18. buffer
19. computer
20. input block
21. output block
22. circuit
23. transmitter
24. data carrier
25. receiver
26. transmitting means
27. data channel
28. data channel
29. receiving means